feat: Support build custom flashinfer #1886
Signed-off-by: Guyue Huang <guyueh@nvidia.com>
📝 Walkthrough
This PR adds support for building a custom FlashInfer integration by introducing a new bash script that automates cloning, validation, and configuration of FlashInfer repositories. The script is integrated into the Docker build process via a conditional block keyed on the BUILD_CUSTOM_FLASHINFER variable.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~15 minutes
🚥 Pre-merge checks | ✅ 3 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (3 passed)
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In `@docker/Dockerfile`:
- Around line 109-111: The Dockerfile uses the shell variable
BUILD_CUSTOM_FLASHINFER in the conditional block but never declares it as a
build ARG, so add an ARG declaration (e.g., add "ARG BUILD_CUSTOM_FLASHINFER" or
"ARG BUILD_CUSTOM_FLASHINFER=") above the if-statement to ensure the build-arg
is accepted and propagated. Leave the default empty: a default such as 0 is
itself non-empty and would trigger the conditional on every build. Keep the
existing conditional that calls tools/build-custom-flashinfer.sh when
BUILD_CUSTOM_FLASHINFER is non-empty so builds can enable this via
--build-arg BUILD_CUSTOM_FLASHINFER=1.
In `@tools/build-custom-flashinfer.sh`:
- Around line 38-41: The script currently forces SSH host key verification off
via GIT_SSH_COMMAND which is insecure; change it to only disable verification
when an explicit opt-in env var is set (e.g. ALLOW_INSECURE_SSH=true or
INTERNAL_BUILD=true) and only for SSH-style URLs (check GIT_URL starts with
"ssh://" or matches an SSH host:path pattern) before setting GIT_SSH_COMMAND;
otherwise run a normal git clone to preserve host key checking. Update the logic
around GIT_SSH_COMMAND, GIT_URL and BUILD_DIR so the insecure options are
conditional on the env var and SSH URL detection.
- Around line 58-79: The script fails because project is never defined and it
blindly appends into vllm causing duplicates; fix by initializing project =
doc.setdefault("project", {}) then opt =
project.setdefault("optional-dependencies", {}) and vllm_list =
opt.setdefault("vllm", []) and only append "flashinfer-python" if it's not
already present; also make the sources addition idempotent by checking
sources.get("flashinfer-python") and only set it when missing or different (use
the existing symbols pyproject_path, doc, project, opt, vllm_list, sources, and
the "flashinfer-python" key).
- Around line 26-30: The script currently runs realpath on a target that may not
exist, causing the script to abort under set -e; update the BUILD_DIR assignment
so it does not call realpath on a non-existent path (either use
BUILD_DIR="$SCRIPT_DIR/../3rdparty/flashinfer" or use realpath -m
"$SCRIPT_DIR/../3rdparty/flashinfer") and keep the subsequent existence check
that echoes the error and exits; modify the line that defines BUILD_DIR (and any
usages relying on its absolute value) to use one of these safer alternatives.
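The two suggestions for `tools/build-custom-flashinfer.sh` that involve logic (opt-in insecure SSH and idempotent `pyproject.toml` edits) can be sketched as below. This is a minimal illustration, not the code from the PR: the function names, the `allow_insecure_ssh` flag, the SSH options, the scp-style URL pattern, and the `[tool.uv.sources]` location of `sources` are all assumptions, and a plain `dict` stands in for the parsed TOML document.

```python
import re

PACKAGE = "flashinfer-python"  # key named in the review comment

# Illustrative pattern for scp-style remotes such as git@host:org/repo.git
_SCP_STYLE = re.compile(r"^[\w.-]+@[\w.-]+:(?!//)")


def is_ssh_url(git_url: str) -> bool:
    """Return True for ssh:// URLs and scp-style user@host:path remotes."""
    return git_url.startswith("ssh://") or bool(_SCP_STYLE.match(git_url))


def git_clone_env(git_url: str, base_env: dict, allow_insecure_ssh: bool = False) -> dict:
    """Copy base_env, disabling host key checks only on explicit opt-in for SSH URLs."""
    env = dict(base_env)
    if allow_insecure_ssh and is_ssh_url(git_url):
        # Assumed options; the script's actual GIT_SSH_COMMAND may differ.
        env["GIT_SSH_COMMAND"] = (
            "ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null"
        )
    return env


def add_flashinfer(doc: dict, source: dict) -> dict:
    """Register flashinfer-python in the vllm extra and in sources, idempotently."""
    project = doc.setdefault("project", {})
    opt = project.setdefault("optional-dependencies", {})
    vllm_list = opt.setdefault("vllm", [])
    if PACKAGE not in vllm_list:  # avoid duplicate entries on re-runs
        vllm_list.append(PACKAGE)

    # Assumption: sources live under [tool.uv.sources].
    sources = doc.setdefault("tool", {}).setdefault("uv", {}).setdefault("sources", {})
    if sources.get(PACKAGE) != source:  # set only when missing or different
        sources[PACKAGE] = source
    return doc
```

Calling `add_flashinfer(doc, {"path": "3rdparty/flashinfer"})` twice on the same document leaves a single `flashinfer-python` entry in the `vllm` list; the source value here is a hypothetical placeholder. The same calls should carry over to a parsed TOML document as long as it behaves like a mutable mapping.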
Signed-off-by: root <root@gpu-93.slurm-workers-slurm.slurm.svc.cluster.local>
Signed-off-by: Guyue Huang <140554423+guyueh1@users.noreply.github.com>
…into custom_flashinfer
@terrykong can you review this PR? It is for building a container with custom vLLM and FlashInfer based on the main branch.
Co-authored-by: Terry Kong <terrycurtiskong@gmail.com> Signed-off-by: Guyue Huang <140554423+guyueh1@users.noreply.github.com>
What does this PR do?
Add a script to support building a custom FlashInfer.
Issues
List issues that this PR closes (syntax):
Usage
# Add a code snippet demonstrating how to use this
Before your PR is "Ready for review"
Pre checks:
Additional Information
Summary by CodeRabbit